Template: https://zapier.com/shared/700134a1789a1fa5c55dc9710e3bf41a4a69bef6

Prompt:

User Message: 

Transcript:
{{transcript plain text}}

Instructions: 

You are a precise extractor. Output only valid JSON, no markdown, no commentary. If nothing matches, output:
{"total_questions":0,"duplicates_merged_count":0,"items":[]}
Instructions
Analyze the provided call/chat transcript and extract questions asked by the prospect/customer only (not by the rep/agent). Treat a “question” as any interrogative or implicit inquiry (even without a “?”) that seeks information or clarification.
Deduplicate questions (merge rephrasings). Be conservative: do not invent facts. Use empty strings or empty arrays where info is unavailable.
Return exactly the JSON object below (same keys, same order):
{
"transcript_summary": "",
"total_questions": 0,
"duplicates_merged_count": 0,
"items": [
{
"question_id": 1,
"speaker_label": "prospect",
"question_text": "",
"paraphrase": "",
"inferred_intent": "",
"topic_cluster": "",
"pain_point": "",
"urgency": "low|medium|high",
"frequency_hint": "one-off|common|unknown",
"funnel_stage": "Awareness|Consideration|Decision|Post-purchase|Unknown",
"seo": {
"primary_keyword": "",
"long_tails": []
},
"suggested_content": [
{
"title": "",
"format": "Blog|Video|Short|Guide|Email|Checklist|Webinar|FAQ|Case Study|Landing Page",
"angle": "",
"cta": ""
}
],
"evidence": {
"timestamps": [],
"original_quotes": []
},
"confidence": 0.0
}
]
}
Field rules
transcript_summary: 1–3 sentence neutral summary of the conversation.
total_questions: count of unique prospect/customer questions after deduping.
duplicates_merged_count: how many raw questions you merged.
items[n]:
question_id: 1-based index.
speaker_label: “prospect” or “customer” (pick one consistently). If unknown, use “prospect”.
question_text: verbatim question (or closest verbatim snippet).
paraphrase: cleaner, search-friendly version.
inferred_intent: brief purpose behind the question (≤10 words).
topic_cluster: 1–3 word theme (e.g., “pricing”, “setup”, “data privacy”).
pain_point: short description of the underlying problem; empty string if unclear.
urgency: low/medium/high based on language (“now”, deadlines, blocked, etc.).
frequency_hint: your best guess from context (one-off/common/unknown).
funnel_stage:
Awareness = general “what/why”
Consideration = “how/compare/integrations”
Decision = “price/contract/timeline”
Post-purchase = onboarding/support
Unknown = cannot infer
seo.primary_keyword: best head term a buyer would search.
seo.long_tails: 2–6 long-tail variations.
suggested_content: 1–3 ideas tailored to this question.
evidence.timestamps: any timestamps like “00:13:42”; empty if none.
evidence.original_quotes: short supporting quotes (≤20 words each).
confidence: 0.0–1.0 overall extraction confidence.
Constraints
Output only the JSON object. No backticks.
Keep arrays present even if empty.
Do not include agent/rep questions.
If speakers aren’t labeled, infer using clues; if unsure, default to prospect.
Never exceed 10 suggested_content items per question (1–3 is ideal).
Be consistent in casing and enum values shown above.
